Read My Lips: Continuous Signer Independent Weakly Supervised Viseme Recognition
Abstract
This work presents a framework for recognising signer-independent mouthings in continuous sign language that requires no manual annotations. Mouthings are lip movements that correspond to the pronunciation of words, or parts of words, during signing. Research on sign language recognition has focused extensively on the hands as features. However, sign language is multi-modal, and a full understanding, particularly with respect to its lexical variety, language idioms and grammatical structures, is not possible without further exploring the remaining information channels. To our knowledge, no previous work has explored dedicated viseme recognition in the context of sign language recognition. The approach is trained on over 180,000 unlabelled frames and reaches 47.1% precision on the frame level. Generalisation across individuals and the influence of context-dependent visemes are analysed.
Similar resources
Weakly Supervised Metric Learning towards Signer Adaptation for Sign Language Recognition
In this paper, we introduce metric learning into Sign Language Recognition (SLR) for the first time and propose a signer adaptation framework to address signer-independent SLR. For adapting the general model to the new signer, both clustering and manifold constraints are considered in the adaptive distance metric optimization. The contribution of our work is threefold. Firstly, a Wea...
A Chinese sign language recognition system based on SOFM/SRN/HMM
In sign language recognition (SLR), the major challenges now are developing methods that solve signer-independent continuous sign problems. In this paper, SOFM/HMM is first presented for modeling signer-independent isolated signs. The proposed method uses the self-organizing feature maps (SOFM) as different signers’ feature extractor for continuous hidden Markov models (HMM) so as to transform ...
Primary research on the viseme system in Standard Chinese
The study of traditional phonetics indicates that the shape of the lips has an important effect on the articulation of consonants and vowels [1]. AVSP (Audio-Visual Speech Processing) can improve the naturalness of synthetic speech and the recognition rate of speech recognition systems. Especially in computer-synthesized faces, the movements of the lip shape play a crucial role. The present research aims t...
Lip Localization and Viseme Recognition from Video Sequences
Viseme (visual cue) recognition is one of the steps in building an automated lip-reading system. In order to recognize a viseme, one has to first detect the lips of the speaker in the video sequences and track them to extract the feature vectors for the final recognition. A novel method for lip localization based on color models has been proposed. Also, the basic possible li...
Viseme recognition using multiple feature matching
In this paper, we present a technique for the extraction of the five main visemes produced in natural speech for German. The method belongs to the LDA (Linear Discriminant Analysis) family. The intensity, the edges, and the line segments are used to locate the lips automatically and for viseme classification. Using many features in the recognition is intended to maximize the recognition rate. ...